Error analysis of nuclear mass fits 
We discuss the least-square and linear-regression methods, which are relevant for a reliable
determination of good nuclear-mass-model parameter sets and their errors. In this perspective, we
define exact and inaccurate models and point out differences in using the standard error analyses for
them. As an illustration, we use simple analytic models for nuclear binding energies and study the
validity and errors of the models' parameters, and the uncertainties of their mass predictions. In particular,
we show explicitly the influence of mass-number-dependent weights on uncertainties of liquid-drop
global parameters.
I. INTRODUCTION



Mass or binding energy is one of the most fundamental
properties of the atomic nucleus. Measuring and modelling
nuclear masses has been for many years, and still is, at
the center stage of nuclear physics, see Ref. [1] for a recent
review. Determination of mass from first principles,
viz. quantum chromodynamics, is extremely difficult and
only possible in lattice QCD for composite particles like
mesons or nucleons [2], and is beyond anything possible
or sensible for nuclei. For light nuclei, one can quite
accurately calculate nuclear masses by using many-body
techniques that employ parametrized models of nucleon-nucleon
(NN) and NNN interactions, see e.g. Ref. [3]. In
these so-called ab initio models, parameters are partly
fitted to observables other than mass (like NN phase
shifts) and partly to masses (NNN interactions). There
are many other, less sophisticated methods to calculate
nuclear masses, and all of them include fitting to mass
data to a larger or smaller extent. Therefore, there is an
extensive history of mass fits in nuclear physics.

Nevertheless, and strangely enough, the history of er- 
ror analyses of these mass fits is virtually nonexistent 
(but see notable examples in Refs. 0, [f|). As a con- 
sequence, there exist in the literature very many mass 
tables and mass predictions, but there are no estimates 
of the reliability of these results, which would be based 
on thorough methods of analyzing their uncertainties. 

In the present study, we aim at (i) recalling the well-known
methods that must be used to analyze errors along
with any fits of parameters, and (ii) pointing out several
particular features of such analyses that are characteristic of
applications to mass fits. At present, one cannot overestimate
the importance of quantitatively analyzing the
predictivity of mass calculations when applied to exotic
nuclei far from stability. However, such mass calculations
must be accompanied by predictions of their theoretical
error bars. On the one hand, professional error analyses
will put predictions on firm grounds, often showing explicitly
that such predictions are simply impossible when
they are based on a given model fitted to a given set of
masses. On the other hand, they will give quantitative
information on how much a measurement of the mass of the
last available isotope (often very difficult) will improve
the predictivity of models.

As a benchmark number that characterizes mass fits, 
one has the mass root-mean-squared (rms) deviation, 
which nowadays does not go below about 0.6 MeV [H, H, 
7]. Down to this level, nuclear models were successfully 
used to describe nuclear masses, and moreover, they of- 
ten correctly describe other observables like charge radii 
and other ground-state properties and excitations. In 
the present study we do not enter into the discussion 
of which observables, apart from mass, should be used 
to fit given models to data. Of course, error analyses 
should be performed when fitting any kinds of observ- 
ables, although our particular example below concerns 
only a mass model. 

In particular, the best Skyrme and Gogny energy- 
density-functional (EDF) methods [8], fitted to large 
numbers of nuclei, have resulted in rms deviations of 0.7- 
1.0 MeV from experimental masses. The deviations from 
experiment are not random, but show systematic patterns [9].
These patterns are a clear sign that the functionals
are too simplified, see also Ref. [10]. Systematic
methods are needed to improve EDF models by introducing
new terms (for example, by using density-dependent
coupling constants, see e.g. Refs. [11, 12], or higher-order
derivative terms [13]) and testing the importance and
physical feasibility of the new terms.

Current EDF models typically use 10-14 parameters 
or coupling constants. Skyrme functionals, for example,
have a quite clear physical interpretation for all of the
parameters of the functional. If the number of model
parameters is drastically increased, the meaning and
importance of parameters might not always be clear. To be
able to understand the significance of each parameter, 
clear and efficient methods must be used, as discussed in 
the present study. 






II. METHODS OF REGRESSION ANALYSIS 

In this section we briefly recapitulate methods used in
the standard linear regression method [14]. Along with
presenting necessary definitions and main results, we also 
discuss several aspects that are specific to our particular 
problem of nuclear mass fits. 

Let us assume that we use a model describing $j = 1, \ldots, m$
observables $e_j$ that depend on $n$ parameters $x_i$, $i = 1, \ldots, n$,

$$
e_j = f_j(x). \tag{1}
$$

To find an optimal set of parameters, a fitting procedure
has to be used, whereupon the rms deviation (including
in regression analysis a $1/(m-n)$ normalization)

$$
\Delta^2_{\rm rms} = \frac{1}{m-n} \sum_{j=1}^{m} W_j \left( e_j^{\rm exp} - f_j(x) \right)^2 \tag{2}
$$

between experimental values of observables, $e_j^{\rm exp}$, and the
observables given by the model is minimized by adjusting
the model parameters. This is called the least square
fitting procedure. As is usually the case, the number
of observables is larger than the number of parameters,
$m > n$.

Each term in the sum of Eq. (2) is multiplied by a
weight factor $W_j > 0$. In this respect we can single out
two limiting situations, of an exact and an inaccurate
model:

• The model of Eq. (1) is exact and deviations in
Eq. (2) result solely from imprecisely measured
experimental values. In this case, one takes the
weights $W_j = (\Delta e_j)^{-2}$, where $(\Delta e_j)^2$ are experimental
variances of observables $e_j$.

• The model of Eq. (1) is a poor approximation of reality
and deviations in Eq. (2) are much larger than
the experimental variances of observables. In this
case, the choice of weights is quite arbitrary and
can only be based on intuition. By using different
weights one can, in fact, differentiate between
the importance of various observables in determining
the model parameters. It is clear that the result of
adjustment may crucially depend on the choice of
weights.

In the nuclear mass fits discussed in the present paper,
we are obviously in the case of an inaccurate model, because
typical experimental errors are of the order of a
few tens of keV [15], and can be as low as about
100 eV [16], while average deviations of mass models do
not go below about 0.6 MeV [1]. In case of several different
kinds of observables included in the fit, dependence
of the results on weights is obvious, see e.g. recent com- 
prehensive analysis in Ref. Q. However, even if only 
nuclear masses are fitted, the 'natural' choice of weights,
$W_j = 1$, is only a choice, and many other choices are possible,
e.g. depending on whether one wants to put more
weight on measured values of light or heavy, or stable or
exotic nuclei. We illustrate this point in Sec. III below.



A. Determination of parameters 

The function (2) has an extremum when all its partial
derivatives with respect to the model parameters $x_i$ are
simultaneously zero,

$$
\frac{\partial \Delta^2_{\rm rms}}{\partial x_i} = 0, \qquad i = 1, \ldots, n. \tag{3}
$$

These partial derivatives are in general non-linear functions
of the model parameters; thus to get manageable
equations, Eq. (1) has to be linearized, i.e.,

$$
f_j(x) \simeq f_j(x_0) + \sum_{i=1}^{n} \left. \frac{\partial f_j}{\partial x_i} \right|_{x = x_0} (x_i - x_{0,i}). \tag{4}
$$



For observables related to total or single-particle energies, 
the non-linearities can actually be quite small [J, 1 1 01 ] , but 
in general this is not the case and the linearized equations
have to be solved iteratively.

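We now introduce the notation that $x_0$ is the set of parameters
from the previous iteration, so that $x_i - x_{0,i}$ is the
change of parameters to be determined. We also denote
the weighted deviations of observables from experiment
by $y_j$,

$$
y_j = \sqrt{W_j} \left( e_j^{\rm exp} - f_j(x_0) \right), \tag{5}
$$

and the weighted matrix of regression coefficients is denoted as

$$
J_{ji} = \sqrt{W_j}\, I_{ji} \tag{6}
$$

for

$$
I_{ji} = \left. \frac{\partial f_j}{\partial x_i} \right|_{x = x_0}. \tag{7}
$$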



Then, Eq. (2) can be written as

$$
\Delta^2_{\rm rms} = \frac{1}{m-n} \sum_{j=1}^{m} \left( \sum_{i=1}^{n} J_{ji} (x_i - x_{0,i}) - y_j \right)^2, \tag{8}
$$

and Eq. (3) takes the form:

$$
(J^T J)(x - x_0) = J^T y. \tag{9}
$$



It is now obvious that the parameters lying in the null
space of $J^T J$ (if it is singular) cannot be determined.
Moreover, during the fitting procedure it often happens
that some parameters are very poorly determined by the
experimental data. These parameters should be removed
from the set because they have very large uncertainties
and, if kept, would destroy the subsequent error analysis
(see below). The poorly determined parameters can be
found by first transforming to a new set of parameters,
here called 'independent parameters', and then eliminating
all non-important independent parameters from the
fit.

This can be achieved by making a singular value decomposition
(SVD) of matrix $J$,

$$
J_{ji} = \sum_{k=1}^{q} U_{jk}\, w_k\, V^T_{ki}, \tag{10}
$$

where columns of the $m \times q$ matrix $U$ are orthonormal
($U^T U = 1$), columns of the $n \times q$ matrix $V$ are also
orthonormal ($V^T V = 1$), and the $q$ positive numbers $w_k$ are
called singular values of $J$. Note that for a singular matrix
$J^T J$ one has $q < n$, and the vanishing singular values do
not contribute to the sum in Eq. (10).

The SVD of $J$ allows one to calculate the inverse of
$J^T J = V w^2 V^T$ outside its null space,

$$
(J^T J)^{-1} = V \frac{1}{w^2} V^T, \tag{11}
$$

and the solution of Eq. (9) can now be expressed as

$$
x - x_0 = (J^T J)^{-1} J^T y = V \frac{1}{w} U^T y. \tag{12}
$$

The new independent parameters are now defined as
$z = V^T x$. If some singular values become very small, the
associated variables are simply dropped from Eq. (12),
i.e.,

$$
z_k - z_{0,k} =
\begin{cases}
\dfrac{1}{w_k} \sum_{j=1}^{m} U^T_{kj}\, y_j & \text{for } w_k > \epsilon, \\
0 & \text{for } w_k \le \epsilon,
\end{cases} \tag{13}
$$

and the new parameters $x_i$ become

$$
x_i = x_{0,i} + \sum_{w_k > \epsilon} V_{ik} \frac{1}{w_k} \sum_{j=1}^{m} U^T_{kj}\, y_j. \tag{14}
$$

These new values can now be used to continue iterations. 
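
The truncated-SVD step of Eqs. (12)-(14) can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions: the function name `svd_step`, the toy matrix with two nearly degenerate columns, and the threshold value are hypothetical and are not taken from the calculations of the present study.

```python
import numpy as np

def svd_step(J, y, eps):
    """One linearized step x - x0 = V (1/w) U^T y, cf. Eq. (12),
    keeping only singular values w_k > eps, cf. Eqs. (13)-(14)."""
    U, w, Vt = np.linalg.svd(J, full_matrices=False)   # J = U diag(w) Vt
    keep = w > eps
    dz = np.zeros_like(w)
    dz[keep] = (U.T @ y)[keep] / w[keep]                # z_k - z_{0,k}
    dx = Vt.T @ dz                                      # x - x0
    return dx, int(keep.sum())

# Toy linear model (assumed data): the third column almost repeats the second,
# so one independent parameter is practically undetermined.
t = np.linspace(0.0, 1.0, 50)
J = np.column_stack([np.ones_like(t), t, t + 1e-9 * t**2])
y = J @ np.array([1.0, 2.0, 0.0])

dx, n_kept = svd_step(J, y, eps=1e-6)
print(n_kept)                            # number of singular values kept
print(np.max(np.abs(J @ dx - y)))        # the fit quality is not degraded
```

In this toy case one singular value falls below the threshold and is dropped, while the remaining two independent parameters still reproduce the data.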



B. Error estimates 

After the iteration has converged, one can determine
error estimates for the obtained parameters. The
method used here follows the standard multivariate regression
analysis [18, 19]. Assume that we take the scaled
experimental observables and perturb them with a random
noise that has zero mean value. The true experimental
energies can now be thought of as being random variables,
but only one sample that has the values $\sqrt{W_j}\, e_j^{\rm exp}$ is
known. The deviation of each model parameter $x_i$ from
its mean can then be calculated from Eq. (12) as

$$
x_i - \langle x_i \rangle = \sum_{j} \left( (J^T J)^{-1} J^T \right)_{ij} \left( y_j - \langle y_j \rangle \right). \tag{15}
$$

Then, the covariance matrix of parameters $x_i$ and $x_{i'}$
becomes

$$
\left\langle (x_i - \langle x_i \rangle)(x_{i'} - \langle x_{i'} \rangle) \right\rangle
= \sum_{j} \sum_{j'} \left( (J^T J)^{-1} J^T \right)_{ij} \left( (J^T J)^{-1} J^T \right)_{i'j'}
\left\langle (y_j - \langle y_j \rangle)(y_{j'} - \langle y_{j'} \rangle) \right\rangle
= S^2_{\rm rms}\, (J^T J)^{-1}_{ii'}, \tag{16}
$$
where

$$
S_{\rm rms} = t_{\alpha/2,\, m-n}\, \Delta_{\rm rms} \tag{17}
$$

and $t_{\alpha/2,\, m-n}$ is the value of Student's t-distribution [20] for $m - n$ degrees
of freedom, necessary here because of small sample
size. In Eq. (16) we have assumed that $y_j$ are independent
random variables whose cross expectation values vanish
and all have the same standard deviation, i.e.,

$$
\left\langle (y_j - \langle y_j \rangle)(y_{j'} - \langle y_{j'} \rangle) \right\rangle = \delta_{jj'}\, S^2_{\rm rms}. \tag{18}
$$

The average values of parameters, $\langle x_i \rangle$, are determined
by the least square fitting procedure, $\langle x_i \rangle = x_{0,i}$. It is
also assumed that the least square fitting gives an accurate
estimate of the standard deviation of the observables
$e_j$. With these assumptions, from Eq. (16) we get the
following formula for the confidence interval of $x_i$ with
$(1 - \alpha)$ probability:

$$
\Delta x_i = \sqrt{ \left\langle (x_i - \langle x_i \rangle)^2 \right\rangle } = S_{\rm rms} \sqrt{ (J^T J)^{-1}_{ii} }. \tag{19}
$$

It is now clear that small SVD values that appear in the
inverse matrix of Eq. (11) spoil confidence intervals of all
parameters, and have to be removed, as in Eq. (13). One
should observe that Eq. (19) does implicitly depend on
the weights through the definitions of Eqs. (5), (6), and

We have to stress at this point that the error estimates
of Eq. (19) have quite different meaning for the
exact and inaccurate models discussed at the beginning
of this section. In the first case, errors of parameters
result solely from the statistical noise in measured
observables; variances thereof are supposed to be known
and define weights in Eq. (2) as $W_j = (\Delta e_j)^{-2}$. Therefore,
within the exact model, the assumption of equal
variances, Eq. (18), is well justified. Such a model then
gives the minimum value of $\Delta^2_{\rm rms}$ near 1, which is the
so-called $\chi^2$ test.

For an inaccurate model, the error estimates of
Eq. (19) only give information on the sensitivity of the
model parameters to values of the observables. They correspond
to the situation where the experimental values
are artificially varied far beyond their experimental uncertainties,
so as to induce tangible variations in values
of parameters. Eq. (18) then means that the range of
this variation is inversely proportional to $\sqrt{W_j}$, i.e. it is
commensurate with the importance attributed to a given
observable. Here, the error estimates may depend on the 
weights, and thus are affected by their choices, similarly 
as the values of parameters are. 
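
The parameter error estimate of Eq. (19) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code used in the present study: the function name `parameter_errors` and the straight-line toy data are hypothetical, the regression matrix is assumed to be non-singular, and the Student quantile is passed in by the caller (the value 2.58 used below is the normal-distribution limit, which is close to the 99% level only when $m - n$ is large).

```python
import numpy as np

def parameter_errors(J, resid, t_factor):
    """Return Delta x_i = S_rms * sqrt((J^T J)^{-1}_ii), cf. Eq. (19).

    J        : weighted regression matrix, shape (m, n), assumed non-singular
    resid    : weighted residuals y_j at the minimum, shape (m,)
    t_factor : Student's t quantile t_{alpha/2, m-n}, supplied by the caller
    """
    m, n = J.shape
    delta_rms = np.sqrt(np.sum(resid**2) / (m - n))   # rms deviation with 1/(m - n)
    s_rms = t_factor * delta_rms                      # cf. Eq. (17)
    _, w, Vt = np.linalg.svd(J, full_matrices=False)
    diag_inv = np.sum((Vt.T / w) ** 2, axis=1)        # diagonal of V w^-2 V^T, cf. Eq. (11)
    return s_rms * np.sqrt(diag_inv)

# Toy example (assumed data): straight line with Gaussian noise, unit weights.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
J = np.column_stack([np.ones_like(t), t])
e_exp = 1.0 + 2.0 * t + rng.normal(0.0, 0.1, t.size)
x_fit, *_ = np.linalg.lstsq(J, e_exp, rcond=None)
resid = e_exp - J @ x_fit
errors = parameter_errors(J, resid, t_factor=2.58)
print(x_fit, errors)
```

Computing the diagonal of $(J^T J)^{-1}$ from the singular values avoids forming and inverting $J^T J$ explicitly, and makes it easy to see how a very small $w_k$ would inflate all error estimates.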

We are now in a position to discuss a very important
aspect of the mass fits, namely, the mass predictions and
error propagation. Suppose that we apply the model of
Eq. (1) not only to the measured masses but also to the
masses of unknown nuclei,

$$
\tilde{e}_j = f_j(x), \tag{20}
$$

where the tilde means that the set of observables $\tilde{e}_j$ includes
not only those used for the fit, $j = 1, \ldots, m$, but
also many other ones, $j = m + 1, \ldots, M$.

The error estimates of Eq. (15) allow us to estimate uncertainties
of the predicted observables. With the same
assumptions as before, but now using the parameters $x_i$
from the least square fit both for observables inside and
outside the fitted set, we get

$$
\left\langle (\tilde{e}_j - \langle \tilde{e}_j \rangle)^2 \right\rangle
= \sum_{ii'} I_{ji}\, I_{ji'} \left\langle (x_i - \langle x_i \rangle)(x_{i'} - \langle x_{i'} \rangle) \right\rangle, \tag{21}
$$

where $I_{ji}$ are the regression coefficients, Eq. (7), of observables
$\tilde{e}_j$ with respect to the model parameters $x_i$.
Then, the confidence intervals of predicted observables
become

$$
\Delta \tilde{e}_j = \sqrt{ \left\langle (\tilde{e}_j - \langle \tilde{e}_j \rangle)^2 \right\rangle }
= S_{\rm rms} \sqrt{ \left( I (J^T J)^{-1} I^T \right)_{jj} }, \tag{22}
$$

where we have used Eq. (16).

Equations (19) and (22) form the basis of the error
analysis of our mass fits. The calculated error bars (19)
of parameters $x_i$ must then be further scrutinized to analyze
which parameters are necessary and which should be
removed from the model. The confidence intervals (22)
constitute estimates of predictivity of the model. Note
that they should also be calculated for the observables
that have actually been used in the fit. It is these intervals,
and not the residuals $y_j / \sqrt{W_j}$, which have to be
analyzed when discussing the quality of the model. Indeed,
it is obvious that the residuals can be arbitrarily
small for some observables, or for some types of observables
(e.g., masses of semimagic spherical nuclei), while
the model can still be quite uncertain in describing these 
same observables. 
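
The propagation of parameter uncertainties to predicted observables, Eq. (22), can be sketched in the same way. The function name `prediction_intervals`, the straight-line toy model, and the numerical value of $S_{\rm rms}$ are assumptions made only for this illustration; the pseudoinverse is used here as one possible way of restricting $(J^T J)^{-1}$ to the space outside the null space.

```python
import numpy as np

def prediction_intervals(I_pred, J, s_rms):
    """Delta e_j = S_rms * sqrt((I (J^T J)^{-1} I^T)_jj), cf. Eq. (22).

    I_pred : unweighted regression coefficients of the observables
             to be predicted, shape (M, n)
    J      : weighted regression matrix used in the fit, shape (m, n)
    s_rms  : scaled rms deviation S_rms of the fit
    """
    cov = np.linalg.pinv(J.T @ J)                      # (J^T J)^{-1} outside the null space
    var = np.einsum("ji,ik,jk->j", I_pred, cov, I_pred)
    return s_rms * np.sqrt(var)

# Toy example (assumed): straight-line model fitted on 0 <= t <= 1, unit weights.
t_fit = np.linspace(0.0, 1.0, 100)
J = np.column_stack([np.ones_like(t_fit), t_fit])
t_new = np.array([0.5, 1.0, 3.0])      # center of, edge of, and outside the fitted range
I_pred = np.column_stack([np.ones_like(t_new), t_new])
intervals = prediction_intervals(I_pred, J, s_rms=0.258)
print(intervals)
```

In this toy case the interval is smallest at the center of the fitted range and grows for the point extrapolated beyond it, which is the generic behavior of Eq. (22) for a linear model.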



III. EXAMPLE APPLICATION 

To illustrate the fitting and error analysis techniques
of the previous section we use them within a simple nuclear
mass model. The model expresses nuclear binding
energy as a sum of the liquid drop (LD) and shell energies [21].
The LD energy we use closely resembles the
Myers-Swiatecki LD formula [22] with symmetry terms in
volume and surface energy parts and a modified Coulomb
part. It has the form

$$
E_{\rm LD}(N,Z) = a_v A + a_s A^{2/3} + a_{v,{\rm sym}} I^2 A
+ a_{s,{\rm sym}} I^2 A^{2/3} + a_c \frac{Z(Z-1)}{A^{1/3}} + a_p \frac{P}{A^{1/2}}, \tag{23}
$$

where $I = (N - Z)/A$ and $2P = (-1)^N + (-1)^Z$. The
shell energy is modelled by polynomials of $N$ and $Z$:

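$$
\begin{aligned}
E_{\rm shell}(N,Z) = {} & x_{i,1} + x_{i,2}\, n + x_{i,3}\, z \\
& + x_{i,4}\, n^2 + x_{i,5}\, n z + x_{i,6}\, z^2 \\
& + x_{i,7}\, n^3 + x_{i,8}\, n^2 z + x_{i,9}\, n z^2 + x_{i,10}\, z^3 \\
& + x_{i,11}\, n^4 + x_{i,12}\, n^3 z + x_{i,13}\, n^2 z^2 \\
& + x_{i,14}\, n z^3 + x_{i,15}\, z^4,
\end{aligned} \tag{24}
$$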

where $z = Z - Z_i$ and $n = N - N_i$. The index $i$ enumerates
15 different rectangular areas on the nuclear mass
chart delimited by magic numbers, see Fig. 1. In each
such area, $N$ and $Z$ values are between given magic
numbers $N_i$ and $Z_i$. We restrict parameters of polynomials
(24) in such a way that the shell effects are continuous
across magic proton and neutron numbers; however, the
derivatives thereof can be non-continuous. In this way
the model can produce the binding-energy cusps at magic 
nucleon numbers. 
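
For orientation, the liquid-drop part of the model, Eq. (23), can be evaluated with a few lines of Python. This is a sketch under explicit assumptions: the Coulomb and pairing terms are taken here to scale as $Z(Z-1)/A^{1/3}$ and $P/A^{1/2}$, respectively, the parameter values are the "defining" values of Table I, and the names `LD_PARAMS` and `ld_energy` are hypothetical.

```python
# LD parameters in MeV: the "defining" values listed in Table I.
LD_PARAMS = {"v": 14.9455, "s": -14.9326, "v_sym": -22.3303,
             "s_sym": 7.5995, "c": -0.65709, "p": 11.3655}

def ld_energy(N, Z, a=LD_PARAMS):
    """Liquid-drop binding energy (MeV) in the form of Eq. (23).

    Assumption: Coulomb term ~ Z(Z-1)/A^(1/3), pairing term ~ P/A^(1/2).
    """
    A = N + Z
    I = (N - Z) / A
    P = ((-1) ** N + (-1) ** Z) / 2        # +1 even-even, -1 odd-odd, 0 odd-A
    return (a["v"] * A + a["s"] * A ** (2 / 3)
            + a["v_sym"] * I**2 * A + a["s_sym"] * I**2 * A ** (2 / 3)
            + a["c"] * Z * (Z - 1) / A ** (1 / 3)
            + a["p"] * P / A ** 0.5)

e_pb208 = ld_energy(126, 82)   # LD part only; the shell energy is not included
print(round(e_pb208, 1))       # about 1.65e3 MeV under these assumptions
```

The shell-energy polynomials of Eq. (24) would have to be added to this value to obtain the full model binding energy.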

The continuity requirements impose 19 conditions at
semimagic nuclei, see Fig. 1. Each condition results in $p + 1$
linear equations for the parameters $x_{i,k}$, where $p$ is the polynomial order.
Thus for the second-, third-, or fourth-order polynomials
($p = 2$, 3, or 4) we get $(p + 1) \cdot 19 = 57$, 76, or 95 equations
for 90, 150, or 225 parameters, respectively, resulting in
33, 74, or 130 independent variables of the shell energy,
Eq. (24). Together with the six parameters of the liquid-drop
energy, Eq. (23), the model thus contains 39, 80, or
136 independent parameters. 
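
The parameter counting above can be checked with a short script. It assumes, as the counting in the text implies, that all continuity equations are linearly independent; the variable names are chosen only for this illustration.

```python
N_AREAS = 15        # rectangular areas of the mass chart
N_CONDITIONS = 19   # continuity conditions at semimagic nuclei
N_LD = 6            # liquid-drop parameters

counts = {}
for p in (2, 3, 4):
    per_area = (p + 1) * (p + 2) // 2      # monomials n^a z^b with a + b <= p
    n_poly = N_AREAS * per_area            # all polynomial coefficients
    n_eq = (p + 1) * N_CONDITIONS          # linear continuity equations
    n_shell = n_poly - n_eq                # independent shell variables
    counts[p] = (n_poly, n_eq, n_shell, n_shell + N_LD)
    print(p, counts[p])
```

For $p = 4$ this gives 225 coefficients, $5 \cdot 19 = 95$ equations, 130 independent shell variables, and 136 parameters in total.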

It should be noted that the model described above is 
fully linear. This means that the iteration procedure con- 
sists of just one step, because matrix J is then constant 
and the convergence is obtained after just one iteration. 
In this respect the simple model considered here does 
not accurately resemble realistic EDF models. However, 
it allows us to test and showcase all the error analysis 
methods that can also be used in realistic nonlinear EDF 
calculations. 

FIG. 1: (Color online) Areas of nuclear mass chart where the
shell energy polynomials of Eq. (24) are defined. The black
dots mark lines of semimagic nuclei for which the shell energy
polynomials of adjacent rectangles are constrained to have the
same values. The numbers in the rectangles show how many
nuclei in the given area were used in the fit. Semimagic and
magic nuclei always belong to the rectangle to the right and
up.



We used the 1995 mass evaluation of Audi and Wapstra [15]
as our experimental nuclear binding energies.
These masses are outdated, but they serve us only for
illustrative purposes. The full model with fourth-order
polynomials was fitted to m = 2844 experimental and
extrapolated binding energies of nuclei with A > 16, and
the resulting set of parameters was used to create metadata
masses that approximate the experimental masses
with the rms deviation of 1.1 MeV. In this way, we have
constructed the dataset of masses, which is exactly described
by the n = 136 parameters of the full model.
Values of the LD parameters used to define the metadata
are listed in Table I.



Parameter     Defining value   Fitted value   Error estimate
a_v                14.9455        14.9455         0.0008
a_s               -14.9326       -14.9325         0.0024
a_v,sym           -22.3303       -22.3293         0.0053
a_s,sym             7.5995         7.5965         0.0068
a_c               -0.65709       -0.65708        0.00005
a_p                11.3655        11.3633         0.0187

TABLE I: Values and error estimates (in MeV) of the LD
parameters. Values defining the metadata are compared with
those obtained from fitting the exact model to metadata with
the Gaussian noise of 0.1 MeV.

We do not ascribe to the model of Eqs. (23) and (24)
any particular physical importance, and we are not really
concerned with the question of how well it describes the
experimental data. The model only serves us for the purpose
of creating the metadata, and only these metadata
are the subject of the subsequent analysis.



To the metadata given by the fourth-order model we
add Gaussian noise of a given standard deviation $\sigma$, i.e.,
random numbers are added to all of the 2844 metadata
masses. We stress here that we do not construct any
ensemble of datasets and we do not perform any ensemble
averaging. Indeed, we just have at our disposal the
same number of 2844 "experimental" metadata points,
for which we know exactly what the model and noise
parameters are. Below, the Gaussian noise of $\sigma = 0.1$ MeV
is used unless explicitly indicated.

The main thrust of our study is now at repeating the
least square fits of the second-, third-, and fourth-order
models described above. The fourth-order model is exact,
while the second- and third-order models are inaccurate
(see the discussion at the beginning of Sec. II). Note that
only the metadata shell effects are imprecisely described
by the second- and third-order models; the LD parts of
Eq. (23) always have the same form.

Our purpose is to study the fitting procedure, values
of parameters, error estimates, and confidence intervals
in the situations of exact and inaccurate models. In particular,
we analyze dependence of the least square fits on
the weights chosen for the definition of the rms deviation.
To this end, we choose weights in the form

$$
W_j = \frac{m\, A_j^{\alpha}}{\sum_{j'=1}^{m} A_{j'}^{\alpha}}, \tag{25}
$$

where $A_j$ is the mass number of the given nuclide and $\alpha$
is a parameter. For $\alpha = 0$, one has a 'natural' choice of
all weights being equal, $W_j = 1$, which is the choice most
often used in nuclear mass fits.

However, it is obvious that we can equally well argue
in favor of other choices. On the one hand, for $\alpha = -2$,
the fit would correspond to fitting not binding energies,
but binding energies per particle, $E/A$, which may seem
to be a reasonable choice when discussing the LD model
parameters. Naturally, this choice simply corresponds to
putting much more importance on masses of light than
on those of heavy nuclei. On the other hand, for $\alpha > 0$,
heavy nuclei are considered to be more important for the
mass fits than the light ones, which can be motivated by
the fact that these nuclei are closer to the infinite-matter
limit. Obviously, such arguments are as good as they can
get, but the bottom line is that one has here a freedom
of choice that depends on personal taste and preference.
Below, $\alpha$ is varied from $-1$ to 1, and the value of $\alpha = 0$
is used whenever not explicitly indicated. 
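
The effect of mass-number dependent weights can be illustrated with a toy calculation. The sketch below assumes weights proportional to $A_j^{\alpha}$, normalized so that $\alpha = 0$ gives $W_j = 1$; the toy "data" (a volume plus a surface term) and the deliberately inaccurate one-parameter model (volume term only) are invented for this illustration and have no connection to the metadata of the present study.

```python
import numpy as np

def weights(A, alpha):
    """Weights proportional to A^alpha, normalized so that alpha = 0 gives W_j = 1."""
    w = A.astype(float) ** alpha
    return w * w.size / w.sum()

def weighted_fit(I, e, W):
    """Linear least square fit with weights W_j: minimize sum_j W_j (e_j - (I x)_j)^2."""
    sw = np.sqrt(W)
    x, *_ = np.linalg.lstsq(I * sw[:, None], e * sw, rcond=None)
    return x

A = np.arange(16, 257)
e = 15.0 * A - 17.0 * A ** (2 / 3)       # toy "data": volume + surface terms
I = A[:, None].astype(float)             # inaccurate model: volume term only

a_light = weighted_fit(I, e, weights(A, -2.0))[0]   # emphasizes light nuclei
a_equal = weighted_fit(I, e, weights(A, 0.0))[0]    # all weights equal
a_heavy = weighted_fit(I, e, weights(A, 1.0))[0]    # emphasizes heavy nuclei
a_per_particle = np.mean(e / A)                     # unweighted fit of E/A
print(a_light, a_equal, a_heavy, a_per_particle)
```

In this toy case the fit with $\alpha = -2$ coincides with an unweighted fit of $E/A$, and the fitted volume coefficient of the inaccurate model shifts systematically as the weights move from light to heavy nuclei.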

We begin by discussing the influence of the Gaussian
noise added to the metadata. In Fig. 2 we show the dependence
of the rms deviations of the least square fits (8) on
the standard deviation of the Gaussian noise
$\sigma$. For the exact model, the fitting procedure reproduces
perfectly well the standard deviations of the added noise.
For the inaccurate models, i.e. for the second- and third-order
polynomial fits, one obtains the rms deviations that
are higher than the added noise.

FIG. 2: (Color online) The rms deviations of the least square
fits (8) as functions of the standard deviation of the Gaussian
noise $\sigma$ added to the metadata.

Of course, when the added Gaussian noise goes to zero,
the rms deviation of the exact model also vanishes. For
inaccurate models, in this limiting case the rms deviations
level out and converge to about 1.6 and 1.0 MeV
for the second- and third-order models, respectively. One 
can say that the inaccurate models introduce their own 
intrinsic noises, which are not statistical in nature, but 
represent averaged inaccuracies of the models. One can 
see that at non-zero Gaussian noise, for inaccurate mod- 
els the rms deviations are much smaller than the rms 
of the Gaussian and intrinsic noises. It looks like the 
intrinsic noise is gradually disappearing inside the Gaus- 
sian noise. This is in fact the limit, in which inaccurate 
models become quite good in describing less and less well 
determined experimental data. 
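
The distinction between exact and inaccurate models can be mimicked by a one-dimensional toy analogue of the metadata procedure. In the sketch below, which is only an illustration under stated assumptions, the "metadata" are generated by a fourth-order polynomial in a single variable with arbitrarily chosen coefficients, Gaussian noise of $\sigma = 0.1$ is added, and polynomials of orders 2, 3, and 4 are fitted with unit weights; none of the numbers are taken from the mass fits discussed here.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(-1.0, 1.0, 2000)
coeffs_true = np.array([0.5, -1.0, 2.0, 1.5, -3.0])       # assumed toy values
metadata = np.vander(t, 5, increasing=True) @ coeffs_true  # exact fourth-order "model"
sigma = 0.1
data = metadata + rng.normal(0.0, sigma, t.size)           # one noisy sample only

def fit_rms(order):
    """Least square fit of a polynomial of given order; rms with 1/(m - n) normalization."""
    J = np.vander(t, order + 1, increasing=True)
    x, *_ = np.linalg.lstsq(J, data, rcond=None)
    m, n = J.shape
    return np.sqrt(np.sum((data - J @ x) ** 2) / (m - n))

rms = {p: fit_rms(p) for p in (2, 3, 4)}
print(rms)
```

For the fourth-order (exact) fit the rms deviation comes out close to the added noise, whereas the lower-order fits give larger values because of their own systematic deficiencies.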

In Fig. 3 we show the distributions of fit residuals,

$$
\delta e_j = e_j^{\rm exp} - f_j(x_0), \tag{26}
$$

obtained by fitting the three considered models to metadata
containing the $\sigma = 0.1$ MeV Gaussian noise. As
expected, for the fourth-order (exact) model, the distribution
is perfectly Gaussian with the same width
of 0.1 MeV. For the second- and third-order inaccurate
models, the distributions are not only wider, with the
widths of 1.6 and 1.0 MeV given above, but also do not
have exactly Gaussian shapes. This again illustrates the
non-statistical nature of the intrinsic noise within inaccurate
models.

Next, we illustrate the problem of eliminating poorly
determined model parameters, as explained in Eq. (14).
Figure 4 shows the singular values obtained by fitting
to metadata the second-, third-, and fourth-order models.
When the third- and fourth-order polynomials are
used in the fit, and in Eq. (14) the maximum number
of parameters is kept, a number of parameters become ill
defined. This is because some singular values of matrix $J$
become extremely small. As a result, the 3 and 14 smallest
singular values of the fit matrix $J$ must be eliminated
when the third- and fourth-order polynomials, respectively,
are used in the fits to metadata. This elimination
is a direct result of some redundancy in the model parameters,
which is obviously the case in those rectangles
of Fig. 1 where the numbers of experimental data are
small.

FIG. 3: (Color online) Distributions of fit residuals for
three different polynomial fits to metadata. Bin widths are
0.1 MeV. [Horizontal axis: Residual (MeV).]

FIG. 4: (Color online) Singular values of the fit matrix J of
Eq. (10) when three different polynomial orders are used in
the least square fit. [Horizontal axis: Parameter.]

As can be seen from Fig. 5, even more unimportant parameters
could be eliminated from the fits without losing
a significant amount of fit quality. If the second- or third-order
polynomials are used to represent the shell effects,
only about 60% of the independent uncorrelated model
parameters (out of 39 or 80, respectively) are relevant
and the remaining 40% do not contribute significantly to
the fit, and can be safely removed. For the fourth-order
(exact) model this is not the case, and many more parameters
(about 85% of 136) are required to go down to
the value of the rms deviation equal to 0.1 MeV, corresponding
to the Gaussian noise in the metadata.

FIG. 5: (Color online) The rms deviations (8) of the least
square fits to metadata, calculated for the models of Eqs. (23)
and (24), as functions of the number of singular values kept for
matrix J, Eq. (10). [Horizontal axis: Fraction of singular values used.]

Figures 6 and 7 present results of fits performed for
different choices of weights $W_j$, defined in Eq. (25). We
first observe that fits of the fourth-order (exact) model
give results that are entirely independent of weights. For
$\alpha = 0$, values of fitted parameters and their error estimates
are given in Table I. Small differences between the
fitted values and values defining the metadata, and small
values of errors, illustrate the quite small impact of the
0.1 MeV Gaussian noise included in the metadata.

The situation is drastically different for fits of the inaccurate
models. Here, values of the fitted parameters, shown
in Fig. 6, are not only quite different from the exact ones,
but also rather strongly depend on the choice of weights.
It is clear that weights strongly affect the balance between
the volume and surface parameters. For weights
giving greater importance to heavy nuclei ($\alpha > 0$), all absolute
values of volume and surface parameters decrease.
The effect is particularly large for the surface symmetry
parameter $a_{s,{\rm sym}}$, which for the second-order model decreases
from about 9 MeV at $\alpha = -1$ nearly to zero at
$\alpha = 1$.

Variations of parameters, seen in Fig. 6, are much
larger than their error estimates shown in Fig. 7. It
means that the standard way of estimating errors,
Eq. (19), may give significantly overoptimistic results.
We stress here once again that the obtained variations in
the LD parameters are induced by imperfect descriptions
of shell effects only. One can say that such imperfections
do contain smooth particle-number dependencies, which
are then captured by the fitting procedure and get transferred
to values of the LD parameters.

One can, in principle, argue that macroscopic (LD)
and microscopic (shell) effects should not be mixed, but
rather should be fitted separately to avoid the cross-talk
effects described above. This is certainly possible in
macroscopic-microscopic models @ that use separate ex- 
pressions and/or methods to describe these two features 
of the mass surface. However, such separation induces
ambiguities of its own, see e.g. Ref. [23], and, moreover,
it cannot be realized in self-consistent methods, which 
describe the LD and shell effects by the same set of pa- 
rameters. 

In Figs. 8 and 9 we show confidence intervals and residuals,
Eqs. (22) and (26), respectively, of the binding energies
predicted in lead isotopes. For nuclides used in the fit
(the range denoted by dotted vertical lines), confidence
intervals and residuals obtained for the fourth-order (exact)
model nicely reproduce the 0.1 MeV Gaussian noise
included in the metadata.

The situation is again very different for the inaccurate
models, which correspond to fitting the second- or third-order
polynomials. In lead isotopes, residuals of the
third-order model are still quite small, well below the
rms deviation of 1.0 MeV, which is the value characterizing
this fit. It simply means that for these observables,
the model performs quite nicely. However, the confidence
intervals tell us that the quality of the model even in lead
nuclei is not as good as suggested by small residuals.
For the second-order model, residuals become quite high 
but the confidence intervals indicate that the quality of 
the model does not, in fact, deteriorate. Confidence inter- 
vals and residuals give us diverging evaluations of quality 
of models, because the former represent global character- 
istics, which depend only on the standard deviations of 
parameters, while the latter illustrate only local proper- 
ties of the models. 

An interesting property of the confidence intervals is 
the fact that, for nuclei outside the fit, the confidence in- 
tervals quickly increase, independently of the complexity 
of the model. This result is in accordance with results 
obtained within realistic nuclear mass models, whose pre- 
dictions (for nuclei outside the fit) deviate greatly from 
each other. On the one hand, such an increase of the 
confidence intervals is a reflection of poor predictivity 
of models when they are extrapolated to exotic nuclei. 
On the other hand, the confidence intervals simply quan- 
tify this uncertainty of extrapolation and constitute pre- 
cise measures of the natural fact that such extrapola- 
tions must be uncertain. This is so because the model 
parameters are rather loosely defined by the metadata, 
and therefore, important information is missing from the 
models. 

The discontinuity of confidence intervals at $N = 126$ is
an artifact of the model, which uses different parameters
in rectangles delimited by magic numbers, see Fig. 1.
Note that the model ensures the continuity of binding
energies, but the confidence intervals need not be continuous.

FIG. 7: (Color online) Same as in Fig. 6 but for the error estimates, Eq. (19), of the LD parameters.



IV. CONCLUSIONS 

In the present study, we have pointed out the necessity
of estimating errors along with estimating values of
parameters that define nuclear mass models. Such errors
allow not only for quantifying quality of models in terms
of confidence intervals instead of fit residuals, but also
for putting theoretical error bars on mass predictions.

A crucial element in the error analysis is the fact that
the nuclear mass models belong to the class of inaccurate
models, which describe data with accuracy that is much
lower than that of the data themselves. For such models,
standard least-square methods to estimate errors and
values of parameters are not based on statistical assumptions,
but rather pertain to analyzing sensitivity of the
model parameters to data. Consequently, results may,
and do, depend on weights that are used when defining
the rms deviations between the model results and data.

The discussion of error analysis was illustrated by using 
a simple mass model that includes a global liquid-drop 
part and a locally fluctuating shell-effect part, with a 
number of model parameters. A set of metadata masses 
was generated by fitting the most complex variant of the 
model with the fourth-order shell-effect polynomials to 
experimental nuclear binding energies. The metadata 
were then used as an "experimental" input for performing
fits that used less sophisticated second- and third-order 
polynomials. In this way, we had at our disposal the ex- 
act model of the metadata and two inaccurate models 
that mimicked realistic situation in mass fits. 

Within such a scheme, we were able to illustrate many
properties of nuclear mass fits. In particular, we showed
explicitly the relations between the statistical noise in the
metadata and error estimates. We also presented methods
to differentiate between important and unimportant
model parameters, which are based on the singular value
decomposition of the regression matrix. By performing
mass fits with mass-number dependent weights, we
showed that values of the model parameters may involve
much larger uncertainties than those given by standard
error estimates. Finally, we have exemplified the role of
confidence intervals and fit residuals in evaluating the
quality of exact and inaccurate models.

[Figure axis label: Neutron number]



FIG. 8: (Color online) Confidence intervals (99% confidence
level) of binding energies of the model defined in Eqs. (23)
and (24), calculated in lead isotopes using Eq. (22).

FIG. 9: (Color online) Same as in Fig. 8 but for the binding-energy
residuals, Eq. (26).